
    Thermodynamics of deterministic finite automata operating locally and periodically

    Get PDF
    Real-world computers have operational constraints that cause nonzero entropy production (EP). In particular, almost all real-world computers are 'periodic', iteratively undergoing the same physical process; and 'local', in that subsystems evolve whilst physically decoupled from the rest of the computer. These constraints are so universal because decomposing a complex computation into small, iterative calculations is what makes computers so powerful. We first derive the nonzero EP caused by the locality and periodicity constraints for deterministic finite automata (DFA), a foundational system of computer science theory. We then relate this minimal EP to the computational characteristics of the DFA. We thus divide the languages recognised by DFA into two classes: those that can be recognised with zero EP, and those that necessarily have nonzero EP. We also demonstrate the thermodynamic advantages of implementing a DFA with a physical process that is agnostic about the inputs that it processes.
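
    As a point of reference for the object analysed above, here is a minimal sketch of a DFA recognising a simple regular language; the example language (binary strings with an even number of 1s), the state names, and the encoding are illustrative assumptions, not taken from the paper. Each loop iteration corresponds to one "periodic" step of the physical process.

```python
# Minimal DFA sketch: recognises binary strings with an even number of 1s.
# The language and state labels are illustrative; the paper analyses the
# entropy production of physical implementations of such automata.

DFA = {
    "states": {"even", "odd"},
    "start": "even",
    "accept": {"even"},
    "delta": {                      # transition function: (state, symbol) -> state
        ("even", "0"): "even",
        ("even", "1"): "odd",
        ("odd", "0"): "odd",
        ("odd", "1"): "even",
    },
}

def accepts(dfa, word):
    """Run the DFA locally and periodically: one symbol per iteration."""
    state = dfa["start"]
    for symbol in word:
        state = dfa["delta"][(state, symbol)]
    return state in dfa["accept"]

print(accepts(DFA, "1101"))  # False (three 1s)
print(accepts(DFA, "1100"))  # True  (two 1s)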

    TreeGrad: Transferring Tree Ensembles to Neural Networks

    Full text link
    Gradient Boosting Decision Trees (GBDT) are popular machine learning algorithms with dedicated implementations such as LightGBM and implementations in popular machine learning toolkits like Scikit-Learn. Many implementations can only produce trees offline and in a greedy manner. We explore ways to convert existing GBDT implementations to known neural network architectures with minimal performance loss, in order to allow decision splits to be updated in an online manner, and provide extensions that allow split points to be altered as a neural architecture search problem. We provide learning bounds for our neural network. Comment: Technical report on the implementation of the Deep Neural Decision Forests algorithm, to accompany the implementation at https://github.com/chappers/TreeGrad. Update: please cite as: Siu, C. (2019). "Transferring Tree Ensembles to Neural Networks". International Conference on Neural Information Processing. Springer, 2019. arXiv admin note: text overlap with arXiv:1909.1179
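
    The abstract does not spell out the construction, but a common way to make decision splits updatable online is to relax each hard axis-aligned split into a sigmoid gate over a linear projection so that the routing (and hence the split point) becomes differentiable. The sketch below illustrates that relaxation under this assumption; it is not the paper's exact method.

```python
import numpy as np

def hard_split(x, feature, threshold):
    """Classic GBDT-style split: route right (1) if the feature exceeds the threshold."""
    return (x[:, feature] > threshold).astype(float)

def soft_split(x, weights, bias, temperature=1.0):
    """Differentiable relaxation: a logistic gate over a linear projection.
    With weights = one-hot(feature) and bias = -threshold this approximates
    the hard split, but both can now be updated by gradient descent."""
    logits = (x @ weights + bias) / temperature
    return 1.0 / (1.0 + np.exp(-logits))

x = np.array([[0.2, 3.0], [0.9, -1.0]])
w = np.array([1.0, 0.0])                              # one-hot pick of feature 0
print(hard_split(x, feature=0, threshold=0.5))        # [0. 1.]
print(soft_split(x, w, bias=-0.5, temperature=0.1))   # ~[0.05 0.98]
```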

    Automatic Induction of Neural Network Decision Tree Algorithms

    Full text link
    This work presents an approach to the automatic induction of non-greedy decision trees constructed from a neural network architecture. This construction can be used to transfer weights when growing or pruning a decision tree, allowing non-greedy decision tree algorithms to automatically learn and adapt to the ideal architecture. In this work, we examine the underpinning ideas within ensemble modelling and Bayesian model averaging which allow our neural network to asymptotically approach the ideal architecture through weight transfer. Experimental results demonstrate that this approach improves over models with a fixed set of hyperparameters for both decision tree and decision forest models. Comment: This is a pre-print of the contribution "Chapman Siu, Automatic Induction of Neural Network Decision Tree Algorithms", to appear in Computing Conference 2019 Proceedings, Advances in Intelligent Systems and Computing. Implementation: https://github.com/chappers/automatic-induction-neural-decision-tre
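
    As an illustration of the weight-transfer idea when growing a tree, a leaf can be expanded into a split node whose gate is initialised near-flat and whose children copy the parent's value, so the grown architecture initially computes (almost) the same function and can then adapt. The initialisation scheme below is an assumption made for the sketch, not the paper's exact procedure.

```python
import numpy as np

def grow_leaf(leaf_value, n_features, epsilon=1e-3):
    """Illustrative weight transfer when growing a leaf into a split node.

    The new gate has tiny random weights (near-uniform routing) and both
    child leaves inherit the parent's value, so the grown subtree initially
    predicts what the old leaf predicted and can then be refined by gradient
    descent."""
    gate_w = np.random.randn(n_features) * epsilon
    gate_b = 0.0
    left_value = np.copy(leaf_value)
    right_value = np.copy(leaf_value)
    return gate_w, gate_b, left_value, right_value

w, b, left, right = grow_leaf(np.array([0.7]), n_features=2)
# Prediction of the new subtree for any input x:
#   sigmoid(x @ w + b) * right + (1 - sigmoid(x @ w + b)) * left  ~= 0.7
```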

    An Exact No Free Lunch Theorem for Community Detection

    Full text link
    A precondition for a No Free Lunch theorem is evaluation with a loss function which does not assume a priori superiority of some outputs over others. A previous result for community detection by Peel et al. (2017) relies on a mismatch between the loss function and the problem domain: the loss function computes an expectation over only a subset of the universe of possible outputs, and is thus only asymptotically appropriate with respect to the problem size. By using the correct random model for the problem domain, we provide a stronger, exact No Free Lunch theorem for community detection. The claim generalizes to other set-partitioning tasks including core/periphery separation, k-clustering, and graph partitioning. Finally, we review the literature of proposed evaluation functions and identify functions which (perhaps with slight modifications) are compatible with an exact No Free Lunch theorem.
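
    A toy illustration of the exact No Free Lunch flavour of this result (a deliberately simplified sketch, not the paper's construction): averaged over the full universe of set partitions as possible ground truths, a loss that does not privilege any output gives every candidate partition the same expected loss.

```python
def partitions(items):
    """Enumerate all set partitions of a list (Bell(n) of them)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        # put `first` into each existing block, or into a new block of its own
        for i in range(len(p)):
            yield p[:i] + [[first] + p[i]] + p[i + 1:]
        yield [[first]] + p

def zero_one_loss(guess, truth):
    """1 if the partitions differ, 0 if identical (block order ignored)."""
    canon = lambda p: frozenset(frozenset(block) for block in p)
    return 0.0 if canon(guess) == canon(truth) else 1.0

items = [0, 1, 2, 3]
all_parts = list(partitions(items))          # Bell(4) = 15 partitions
for guess in all_parts[:3]:
    avg = sum(zero_one_loss(guess, t) for t in all_parts) / len(all_parts)
    print(guess, avg)                        # every guess averages 14/15
```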

    Predicting Fluid Intelligence of Children using T1-weighted MR Images and a StackNet

    Full text link
    In this work, we utilize T1-weighted MR images and a StackNet to predict fluid intelligence in adolescents. Our framework includes feature extraction, feature normalization, feature denoising, feature selection, training a StackNet, and predicting fluid intelligence. The extracted features are the distributions of different brain tissues across brain parcellation regions. The proposed StackNet consists of three layers and 11 models, where each layer uses the predictions from all previous layers including the input layer. The proposed StackNet is tested on the public Adolescent Brain Cognitive Development Neurocognitive Prediction Challenge 2019 benchmark and achieves a mean squared error of 82.42 on the combined training and validation set with 10-fold cross-validation. In addition, the proposed StackNet also achieves a mean squared error of 94.25 on the testing data. The source code is available on GitHub. Comment: 8 pages, 2 figures, 3 tables; accepted by the MICCAI ABCD-NP Challenge 2019; Added ND
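
    Below is a minimal two-layer sketch of the stacking pattern described above. The data, base models, and meta-model are placeholders (the actual StackNet uses three layers and 11 models on brain-tissue features); the part being illustrated is the "restacking" step in which each layer sees all previous layers' out-of-fold predictions together with the input features.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict

# Toy data standing in for the extracted brain-tissue features (illustrative only).
X, y = make_regression(n_samples=200, n_features=20, noise=10.0, random_state=0)

def stack_layer(models, X_in, y):
    """One StackNet-style layer: out-of-fold predictions of each base model,
    concatenated to the layer's input, feed the next layer."""
    preds = [cross_val_predict(m, X_in, y, cv=5).reshape(-1, 1) for m in models]
    return np.hstack([X_in] + preds)

layer1 = [Ridge(alpha=1.0), RandomForestRegressor(n_estimators=50, random_state=0)]
X1 = stack_layer(layer1, X, y)        # input features + layer-1 predictions
meta = Ridge(alpha=1.0).fit(X1, y)    # final layer trained on the stacked features
print("in-sample MSE:", np.mean((meta.predict(X1) - y) ** 2))
```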

    DeltaPhish: Detecting Phishing Webpages in Compromised Websites

    Full text link
    The large-scale deployment of modern phishing attacks relies on the automatic exploitation of vulnerable websites in the wild, to maximize profit while hindering attack traceability, detection and blacklisting. To the best of our knowledge, this is the first work that specifically leverages this adversarial behavior for detection purposes. We show that phishing webpages can be accurately detected by highlighting HTML code and visual differences with respect to other (legitimate) pages hosted within a compromised website. Our system, named DeltaPhish, can be installed as part of a web application firewall to detect the presence of anomalous content on a website after compromise, and eventually prevent access to it. DeltaPhish is also robust against adversarial attempts in which the HTML code of the phishing page is carefully manipulated to evade detection. We empirically evaluate it on more than 5,500 webpages collected in the wild from compromised websites, showing that it is capable of detecting more than 99% of phishing webpages while misclassifying less than 1% of legitimate pages. We further show that the detection rate remains higher than 70% even under very sophisticated attacks carefully designed to evade our system. Comment: Preprint version of the work accepted at ESORICS 201
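
    DeltaPhish's actual features are richer (HTML and visual similarity against the legitimate pages of the compromised site), but the following crude HTML-difference signal, offered as an illustrative assumption rather than the system's feature set, conveys the underlying idea of flagging pages that deviate structurally from the host website.

```python
from html.parser import HTMLParser

class TagCollector(HTMLParser):
    """Collect the tag names of a page as a crude structural fingerprint."""
    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

def tag_jaccard(html_a, html_b):
    """Jaccard similarity of the HTML tag sets of two pages; a low score flags
    a page whose structure differs strongly from the site's own pages."""
    sets = []
    for html in (html_a, html_b):
        parser = TagCollector()
        parser.feed(html)
        sets.append(set(parser.tags))
    inter = len(sets[0] & sets[1])
    union = len(sets[0] | sets[1]) or 1
    return inter / union

homepage = "<html><body><h1>Shop</h1><p>Welcome</p></body></html>"
candidate = "<html><body><form><input><input></form></body></html>"
print(tag_jaccard(homepage, candidate))  # low similarity -> suspicious page
```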

    Decentralized learning with budgeted network load using Gaussian copulas and classifier ensembles

    Get PDF
    We examine a network of learners which address the same classification task but must learn from different data sets. The learners cannot share data but instead share their models; models are shared only once so as to limit the network load. We introduce DELCO (standing for Decentralized Ensemble Learning with COpulas), a new approach for aggregating the predictions of the classifiers trained by each learner. The proposed method aggregates the base classifiers using a probabilistic model relying on Gaussian copulas. Experiments on logistic regression ensembles demonstrate competitive accuracy and increased robustness when the classifiers are dependent. A companion Python implementation can be downloaded at https://github.com/john-klein/DELC
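
    The full DELCO aggregation rule is more involved, but its key ingredient is a Gaussian copula modelling dependence between the base classifiers' outputs. The sketch below (an illustrative assumption, not the DELCO code) evaluates that copula density, which reduces to 1 under the independence assumption.

```python
import numpy as np
from scipy.stats import norm

def gaussian_copula_density(u, R):
    """Density of a Gaussian copula with correlation matrix R, evaluated at
    u in (0, 1)^d. With R = identity this is 1, i.e. independent classifiers."""
    z = norm.ppf(np.clip(u, 1e-9, 1 - 1e-9))            # probit transform
    R_inv = np.linalg.inv(R)
    quad = z @ (R_inv - np.eye(len(u))) @ z
    return np.exp(-0.5 * quad) / np.sqrt(np.linalg.det(R))

# Two dependent base classifiers both reporting P(class = 1) = 0.8:
u = np.array([0.8, 0.8])
R = np.array([[1.0, 0.6], [0.6, 1.0]])
print(gaussian_copula_density(u, R))           # > 1: their agreement is expected
print(gaussian_copula_density(u, np.eye(2)))   # 1.0 under independence
```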

    Evidential Bagging: Combining Heterogeneous Classifiers in the Belief Functions Framework

    Get PDF
    In machine learning, ensemble learning methodologies are known to improve predictive accuracy and robustness. They consist in learning many classifiers whose outputs are finally combined according to different techniques. Bagging, or Bootstrap Aggregating, is one of the most famous ensemble methodologies and is usually applied with the same base classification algorithm, i.e. the same type of classifier is learnt multiple times on bootstrapped versions of the initial learning dataset. In this paper, we propose a bagging methodology that involves different types of classifiers. The classifiers' probabilistic outputs are used to build mass functions which are further combined within the belief functions framework. Three different ways of building mass functions are proposed, and preliminary experiments on benchmark datasets show the relevance of the approach.
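
    As an illustration of the belief-functions machinery mentioned above, two classifiers' mass functions on a binary frame can be combined with Dempster's rule. The discounting construction below is one simple way to build a mass function from a probabilistic output, offered as an assumption rather than one of the paper's three proposals.

```python
def discount(p_a, alpha):
    """Mass function on the frame {A, B} built from a probabilistic output
    p_a = P(A), discounted by a reliability factor alpha in [0, 1]; the
    remaining mass 1 - alpha goes to the whole frame 'A|B' (ignorance)."""
    return {"A": alpha * p_a, "B": alpha * (1 - p_a), "A|B": 1 - alpha}

def dempster(m1, m2):
    """Dempster's rule of combination on the binary frame {A, B}."""
    def mass(m, x):
        return m.get(x, 0.0)
    # conflict: one source supports A while the other supports B
    k = mass(m1, "A") * mass(m2, "B") + mass(m1, "B") * mass(m2, "A")
    combined = {}
    for focal in ("A", "B"):
        combined[focal] = (
            mass(m1, focal) * mass(m2, focal)
            + mass(m1, focal) * mass(m2, "A|B")
            + mass(m1, "A|B") * mass(m2, focal)
        ) / (1 - k)
    combined["A|B"] = mass(m1, "A|B") * mass(m2, "A|B") / (1 - k)
    return combined

m1 = discount(0.9, alpha=0.8)   # confident, fairly reliable classifier
m2 = discount(0.6, alpha=0.5)   # hesitant, less reliable classifier
print(dempster(m1, m2))         # combined masses sum to 1
```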

    Program trace optimization

    Get PDF
    This is the author accepted manuscript; the final version is available from Springer via the DOI in this record. Paper to be presented at the Fifteenth International Conference on Parallel Problem Solving from Nature (PPSN XV), Coimbra, Portugal, 8-12 September 2018. We introduce Program Trace Optimization (PTO), a system for 'universal heuristic optimization made easy'. This is achieved by strictly separating the problem from the search algorithm. New problem definitions and new generic search algorithms can be added to PTO easily and independently, and any algorithm can be used on any problem. PTO automatically extracts knowledge from the problem specification and designs search operators for the problem. The operators designed by PTO for standard representations coincide with existing ones, but PTO automatically designs operators for arbitrary representations.
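
    The following toy reconstruction of the trace idea (an assumption about the mechanism, not the PTO code itself) shows how a problem can be specified as a randomised generator whose recorded random choices form the genotype, so that a generic operator such as mutation acts on the trace and any search algorithm can simply replay it.

```python
import random

def generator(trace, budget=10):
    """Problem specification as a randomised program: build a bitstring.
    Random choices are read from (and recorded in) `trace`, so the trace is
    the genotype and replaying it reproduces the same solution."""
    solution = []
    for i in range(budget):
        if i not in trace:
            trace[i] = random.randint(0, 1)
        solution.append(trace[i])
    return solution

def fitness(solution):
    return sum(solution)            # toy objective: OneMax

def mutate(trace):
    """Generic search operator acting on the trace, not on the solution."""
    child = dict(trace)
    key = random.choice(list(child))
    child[key] = random.randint(0, 1)
    return child

trace = {}
best = generator(trace)
for _ in range(100):                # a simple hill-climber over traces
    cand_trace = mutate(trace)
    cand = generator(cand_trace)
    if fitness(cand) >= fitness(best):
        trace, best = cand_trace, cand
print(best, fitness(best))
```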

    Improving the minimum description length inference of phrase-based translation models

    Get PDF
    The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-19390-8_25. We study the application of minimum description length (MDL) inference to estimate pattern recognition models for machine translation. MDL is a theoretically-sound approach whose empirical results are, however, below those of the state-of-the-art pipeline of training heuristics. We identify potential limitations of current MDL procedures and provide a practical approach to overcome them. Empirical results support the soundness of the proposed approach. Work supported by the EU 7th Framework Programme (FP7/2007–2013) under the CasMaCat project (grant agreement no. 287576), by Spanish MICINN under grant TIN2012-31723, and by the Generalitat Valenciana under grant ALMPR (Prometeo/2009/014). Citation: Gonzalez Rubio, J.; Casacuberta Nolla, F. (2015). Improving the minimum description length inference of phrase-based translation models. In Pattern Recognition and Image Analysis: 7th Iberian Conference, IbPRIA 2015, Santiago de Compostela, Spain, June 17-19, 2015, Proceedings. Springer International Publishing, 219-227. https://doi.org/10.1007/978-3-319-19390-8_25
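
    Below is a minimal sketch of the two-part MDL score underlying this kind of inference; the coding scheme, the toy phrase table, and the derivation are illustrative assumptions, not the paper's actual model. MDL selects the phrase table that minimises the bits needed to encode the table plus the bits needed to encode the training corpus given the table.

```python
import math

def model_bits(phrase_table):
    """Crude model cost: 8 bits per character of every phrase pair."""
    return 8 * sum(len(src) + len(tgt) for (src, tgt) in phrase_table)

def data_bits(phrase_table, derivation):
    """Data cost: -log2 probability of the phrase pairs used to explain
    the training corpus under the current table."""
    return sum(-math.log2(phrase_table[pair]) for pair in derivation)

# Toy phrase table with translation probabilities (illustrative only).
table = {("la casa", "the house"): 0.5, ("la", "the"): 0.3, ("casa", "house"): 0.2}
derivation = [("la casa", "the house"), ("la", "the"), ("casa", "house")]

mdl = model_bits(table) + data_bits(table, derivation)
print(round(mdl, 2))  # MDL inference keeps the table that minimises this total
```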